NB: The worksheet has been developed and prepared by Maxim Romanov for the course “R for Historical Research” (U Vienna, Spring 2019). Code snippets, data, and descriptions are reused from: 1) https://kateto.net/network-visualization; 2) http://doi.org/10.5281/zenodo.1411479 (modified).

1 Social Network Analysis with R (02)

2 Libraries

The following are the libraries that we will need for this section. Install those that you do not have yet.

# General ones 
library(tidyverse)
library(readr)
library(stringr)
library(RColorBrewer)
library(dplyr)

# SNA Specific
library(igraph)
library(ggraph)
library(ggrepel)
library(ggalt)
library(visNetwork)

NB: igraph is the main SNA library. There are other packages we can use to visualize graphs (ggraph, for example), but most calculations and the overall analysis are still performed with igraph. You can get an overview of its contents with library(help="igraph"). More information: https://igraph.org/r/; full documentation: https://igraph.org/r/doc/igraph.pdf
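To find out which of these packages still need to be installed, something like the following sketch can be used. It relies on base R only; the vector pkgs simply repeats the names from the library() calls above, and the install.packages() call is left commented out, so nothing is installed until you decide to.

```r
# Packages used in this worksheet (names copied from the library() calls above)
pkgs <- c("tidyverse", "readr", "stringr", "RColorBrewer", "dplyr",
          "igraph", "ggraph", "ggrepel", "ggalt", "visNetwork")

# Keep only the packages that are not installed yet
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}
```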

3 Data

Our practice datasets will include:

4 Star Wars Network

4.1 Edges

Let’s start with loading our Star Wars data and then creating and upgrading our igraph object.

#library(tidyverse)
#library(readr)

sw_e_all <- read_delim("sw_network_edges_allCharacters.csv",
                       "\t", escape_double = FALSE, trim_ws = TRUE)
## Parsed with column specification:
## cols(
##   source = col_character(),
##   target = col_character(),
##   weight = col_double(),
##   episode = col_character()
## )
sw_e_all <- sw_e_all %>%
  mutate(episodeD = as.numeric(gsub("episode-", "", episode)))

sw_e_total <- sw_e_all %>%
  group_by(source, target) %>%
  summarize(weight=sum(weight))

head(sw_e_all)
sw_e_prequel <- sw_e_all %>%
  filter(episodeD <= 3) %>%
  group_by(source, target) %>%
  summarize(weight=sum(weight))

head(sw_e_prequel)
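The aggregation step used above (filter(), then group_by() followed by summarize()) may be easier to follow on a tiny example. The toy edge list below is invented for illustration and is not part of the Star Wars data; .groups = "drop" is an optional argument of summarize() (dplyr 1.0 or later) that returns an ungrouped table.

```r
library(dplyr)

# Toy edge list (invented): the pair A-B occurs in two different episodes
toy_edges <- data.frame(
  source  = c("A", "A", "A", "B"),
  target  = c("B", "B", "C", "C"),
  weight  = c(2, 3, 1, 4),
  episode = c("episode-1", "episode-2", "episode-1", "episode-5"),
  stringsAsFactors = FALSE
)

toy_total <- toy_edges %>%
  mutate(episodeD = as.numeric(gsub("episode-", "", episode))) %>%
  filter(episodeD <= 3) %>%              # keep episodes 1 to 3 only
  group_by(source, target) %>%           # one group per pair of characters
  summarize(weight = sum(weight), .groups = "drop")

toy_total
```

The two A-B rows collapse into a single edge whose weight is the sum of both, and the B-C edge from episode 5 is filtered out.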
  1. Get edges for the original trilogy.
sw_e_original <- sw_e_all %>%
  filter(episodeD >= 4 & episodeD <= 6) %>%
  group_by(source, target) %>%
  summarize(weight=sum(weight))

head(sw_e_original, 50)
  1. Load edges where characters with double personalities are split (three characters: Anakin Skywalker and Darth Vader, Palpatine and Darth Sidious, Count Dooku and Darth Tiranus).
sw_e_all_copy1 <- sw_e_all

# case_when works with mutate to find observations and replace them with other values.
# in the case below it replaces observations in the source and target variables.
# the final statement, TRUE ~ source or TRUE ~ target, simply reuses the data from the source or target variable

sw_double_persona <- sw_e_all_copy1 %>%               
  mutate(sourceSplit = case_when(
    source == "ANAKIN (DW)" ~ "Darth Vader",
    source == "COUNT DOOKU (DT)" ~ "Darth Tiranus",
    TRUE ~ source
  )) %>%
  mutate(targetSplit = case_when(
    target == "PALPATINE (DS)" ~ "Darth Sidious",
    TRUE ~ target
  )) %>%
  select(sourceSplit,targetSplit,episode,episodeD,weight)

sw_double_persona
# a less elegant option without case_when; it only allows us to create three different datasets.

sw_e_all_copy <- sw_e_all

sw_e_split_anakin <- sw_e_all %>%
  mutate(sourceSplit = as.character(gsub("ANAKIN", "Darth Vader", source))) %>% 
  mutate(targetSplit = target) %>%
  group_by(source, sourceSplit, target, targetSplit) %>%
  summarize(weight=sum(weight))

sw_e_split_dooku <- sw_e_all %>%
  mutate(sourceSplit = as.character(gsub("COUNT DOOKU", "Darth Tiranus", source))) %>%
  mutate(targetSplit = target) %>%  
  group_by(source, sourceSplit, target, targetSplit) %>%
  summarize(weight=sum(weight))

sw_e_split_palpatine <- sw_e_all %>%
  mutate(targetSplit = as.character(gsub("PALPATINE", "Darth Sidious", target))) %>%
  mutate(sourceSplit = source) %>%    
  group_by(source, sourceSplit, targetSplit, target) %>%
  summarize(weight=sum(weight))



sw_e_split_anakin <- sw_e_split_anakin %>%
  filter(source == "ANAKIN (DW)")

sw_e_split_dooku <- sw_e_split_dooku %>%
  filter(source == "COUNT DOOKU (DT)")

sw_e_split_palpatine <- sw_e_split_palpatine %>%
  filter(target == "PALPATINE (DS)")

sw_e_split_anakin
sw_e_split_dooku
sw_e_split_palpatine

4.2 Nodes

We will need to generate two separate lists of nodes from our loaded data:

sw_n_prequel <- as.data.frame(unique(c(sw_e_prequel$source,sw_e_prequel$target)))
colnames(sw_n_prequel) <- "ID"
head(sw_n_prequel)
  1. Generate the list of nodes for the original trilogy.
sw_e_original <- sw_e_all %>%
  filter(episodeD >= 4 & episodeD <= 6)

sw_n_original <- as.data.frame(unique(c(sw_e_original$source, sw_e_original$target)))
colnames(sw_n_original) <- "ID"
head(sw_n_original)

Using our file with nodes, we can join additional data to our lists of nodes.

sw_n_data <- read_delim("sw_network_nodes_allCharacters.csv", "\t", escape_double = FALSE, trim_ws = TRUE)

sw_n_prequel <- sw_n_prequel %>%
  left_join(sw_n_data, by="ID")
head(sw_n_prequel, 50)
  1. Repeat this operation for the original trilogy.
sw_n_data <- read_delim("sw_network_nodes_allCharacters.csv", "\t", escape_double = FALSE, trim_ws = TRUE)
## Parsed with column specification:
## cols(
##   ID = col_character(),
##   LABEL = col_character(),
##   AFFILIATIONS = col_character(),
##   SIDE = col_character()
## )
sw_n_original <- sw_n_original %>%
  left_join(sw_n_data, by="ID")
## Warning: Column `ID` joining factor and character vector, coercing into
## character vector
head(sw_n_original)
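A minimal sketch of what left_join() does with node data, using invented toy tables rather than the Star Wars files: nodes without a match in the metadata table are kept and get NA, and anti_join() lists them. Building the ID column as character (stringsAsFactors = FALSE) should also avoid a factor-to-character coercion warning like the one shown above.

```r
library(dplyr)

# Toy node list and toy metadata (invented for illustration)
toy_nodes <- data.frame(ID = c("A", "B", "C"), stringsAsFactors = FALSE)
toy_meta  <- data.frame(ID = c("A", "B"),
                        SIDE = c("light", "dark"),
                        stringsAsFactors = FALSE)

# Every node is kept; node C has no metadata, so its SIDE is NA
toy_joined <- left_join(toy_nodes, toy_meta, by = "ID")
toy_joined

# Nodes that found no match in the metadata table
toy_unmatched <- anti_join(toy_nodes, toy_meta, by = "ID")
toy_unmatched
```

Checking the unmatched nodes before building the graph is a cheap way to catch misspelled IDs.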

Keep in mind that you can always tweak edges and nodes data depending on what you are trying to glean from your network. (NB: for this you may want to create copies of the original EDGES and NODES files and edit them in Excel or another spreadsheet program). In our current example, for instance, you may want to add more data into AFFILIATIONS (or create columns with binary divisions, like the Rebel Alliance versus the Empire, etc.); or, perhaps, to color specific nodes with distinct colors (for example, we may want to use the same colors for the split personalities like Anakin and Darth Vader). You can also add as many additional columns as you see fit (for example, homeworlds).

4.3 Graph / Network Object

Now that we have both necessary components, we can start with our analysis. First, we need to create a graph object, using the igraph library.

#library(igraph)

sw_network_prequel_bare <- graph_from_data_frame(d=sw_e_prequel, vertices=sw_n_prequel, directed=F)

You can check the contents of the igraph object in the following manner:

  1. for the nodes:
vertex_attr(sw_network_prequel_bare)
## $name
##  [1] "ANAKIN (DW)"          "BAIL ORGANA"          "BERU"                
##  [4] "BOBA FETT"            "BOSS NASS"            "BRAVO THREE"         
##  [7] "BRAVO TWO"            "C-3PO"                "CAPTAIN ANTILLES"    
## [10] "CAPTAIN PANAKA"       "CAPTAIN TYPHO"        "CLIEGG"              
## [13] "CLONE COMMANDER CODY" "CLONE COMMANDER GREE" "COUNT DOOKU (DT)"    
## [16] "DARTH MAUL"           "DOFINE"               "FANG ZAR"            
## [19] "FODE/BEED"            "GENERAL CEEL"         "GENERAL GRIEVOUS"    
## [22] "GIDDEAN DANU"         "GREEDO"               "JABBA"               
## [25] "JANGO FETT"           "JAR JAR"              "JIRA"                
## [28] "JOBAL"                "KI-ADI-MUNDI"         "KITSTER"             
## [31] "LAMA SU"              "MACE WINDU"           "MON MOTHMA"          
## [34] "NUTE GUNRAY"          "OBI-WAN"              "ORN FREE TAA"        
## [37] "OWEN"                 "PADME"                "PK-4"                
## [40] "POGGLE"               "QUI-GON"              "R2-D2"               
## [43] "RABE"                 "RUNE"                 "RUWEE"               
## [46] "SEBULBA"              "SENATOR ASK AAK"      "SIO BIBBLE"          
## [49] "VALORUM"              "YODA"                 "ODD BALL"            
## [52] "PALPATINE (DS)"       "RIC OLIE"             "SHMI"                
## [55] "SOLA"                 "SUN RIT"              "WALD"                
## [58] "WATTO"                "TAUN WE"              "TC-14"               
## [61] "TEY HOW"              "TARPALS"              "PLO KOON"            
## [64] "TION MEDON"          
## 
## $LABEL
##  [1] "ANAKIN (DW)"      "BAIL ORGANA"      "BERU"            
##  [4] "BOBA FETT"        "BOSS NASS"        "BRAVO 3"         
##  [7] "BRAVO 2"          "C-3PO"            "Cpt.ANTILLES"    
## [10] "Cpt.PANAKA"       "Cpt.TYPHO"        "CLIEGG"          
## [13] "Cd.CODY"          "Cd.GREE"          "COUNT DOOKU (DT)"
## [16] "DARTH MAUL"       "DOFINE"           "FANG ZAR"        
## [19] "FODE/BEED"        "Gen.CEEL"         "Gen.GRIEVOUS"    
## [22] "GIDDEAN DANU"     "GREEDO"           "JABBA"           
## [25] "JANGO FETT"       "JAR JAR"          "JIRA"            
## [28] "JOBAL"            "KI-ADI-MUNDI"     "KITSTER"         
## [31] "LAMA SU"          "MACE WINDU"       "MON MOTHMA"      
## [34] "NUTE GUNRAY"      "OBI-WAN"          "ORN FREE TAA"    
## [37] "OWEN"             "PADME"            "PK-4"            
## [40] "POGGLE"           "QUI-GON"          "R2-D2"           
## [43] "RABE"             "RUNE"             "RUWEE"           
## [46] "SEBULBA"          "Sen. ASK AAK"     "SIO BIBBLE"      
## [49] "VALORUM"          "YODA"             "ODD BALL"        
## [52] "PALPATINE (DS)"   "RIC OLIE"         "SHMI"            
## [55] "SOLA"             "SUN RIT"          "WALD"            
## [58] "WATTO"            "TAUN WE"          "TC-14"           
## [61] "TEY HOW"          "TARPALS"          "PLO KOON"        
## [64] "TION MEDON"      
## 
## $AFFILIATIONS
##  [1] "Jedi Order, Old Republic, Empire, Sith Order"                  
##  [2] "Old Republic, Rebel Alliance"                                  
##  [3] "Civilians, Outer Rim"                                          
##  [4] "Bounty Hunters"                                                
##  [5] "Old Republic, Naboo, Leader"                                   
##  [6] "Military, Naboo, Old Republic"                                 
##  [7] "Military, Naboo, Old Republic"                                 
##  [8] "Old Republic, Droids, Rebel Alliance, New Republic, Resistance"
##  [9] "Rebel Alliance, Rogue Squadron, New Republic"                  
## [10] "Military, Naboo, Old Republic"                                 
## [11] "Military, Naboo, Old Republic"                                 
## [12] "Civilians, Outer Rim"                                          
## [13] "Military, Old Republic"                                        
## [14] "Military, Old Republic"                                        
## [15] "Jedi Order, Old Republic, Sith Order, Separatists"             
## [16] "Sith Order"                                                    
## [17] "Trade Federation, Old Republis, Separatists"                   
## [18] "Old Republic, Senate"                                          
## [19] "Civilians, Outer Rim"                                          
## [20] "Military, Naboo, Old Republic"                                 
## [21] "Separatists, Old Republic"                                     
## [22] "Old Republic, Senate"                                          
## [23] "Bounty Hunter"                                                 
## [24] "Hutts, Criminals"                                              
## [25] "Bounty Hunters"                                                
## [26] "Naboo, Old Republic, Senate"                                   
## [27] "Civilians, Outer Rim"                                          
## [28] "Civilians, Naboo"                                              
## [29] "Jedi Order, Old Republic"                                      
## [30] "Civilians, Outer Rim"                                          
## [31] "Kamino, Administrator"                                         
## [32] "Jedi Order, Old Republic"                                      
## [33] "Old Republic, New Republic, Senate, Rebel Alliance, Resistance"
## [34] "Separatists, Old Republic, Trade Federation"                   
## [35] "Jedi Order, Old Republic, Rebel Alliance"                      
## [36] "Old Republic, Empire, Senate"                                  
## [37] "Civilians, Outer Rim"                                          
## [38] "Old Republic, Senate"                                          
## [39] "Droids"                                                        
## [40] "Separatists, Techno Union"                                     
## [41] "Jedi Order, Old Republic"                                      
## [42] "Old Rebuplic, New Republic, Rebel Alliance, Resistance, Droids"
## [43] "Old Republic, Naboo"                                           
## [44] "Trade Federation, Separatists"                                 
## [45] "Civilians, Naboo"                                              
## [46] "Outer Rim, Civilians"                                          
## [47] "Old Republic, Senate"                                          
## [48] "Old Republic, Administrator, Naboo"                            
## [49] "Old Republic, Senate"                                          
## [50] "Jedi Order, Old Republic"                                      
## [51] "Old Republic, Military, Clones"                                
## [52] "Old Republic, Senate, Sith Order, Empire"                      
## [53] "Military, Naboo, Old Republic"                                 
## [54] "Civilians, Outer Rim"                                          
## [55] "Civilians, Naboo"                                              
## [56] "Military, Separatists"                                         
## [57] "Civilians, Outer Rim"                                          
## [58] "Civilians, Outer Rim"                                          
## [59] "Administrator, Kamino"                                         
## [60] "Droids, Trade Federation"                                      
## [61] "Trade Federation, Separatists"                                 
## [62] "Military, Naboo, Old Republic"                                 
## [63] "Jedi Order, Old Republic"                                      
## [64] "Old Republic, Utapau, Administrator"                           
## 
## $SIDE
##  [1] "anakin"  "light"   "neutral" "neutral" "light"   "light"   "light"  
##  [8] "neutral" "light"   "light"   "light"   "neutral" "light"   "light"  
## [15] "dark"    "dark"    "dark"    "light"   "neutral" "light"   "dark"   
## [22] "light"   "neutral" "neutral" "neutral" "light"   "neutral" "neutral"
## [29] "light"   "neutral" "neutral" "light"   "light"   "dark"    "light"  
## [36] "neutral" "neutral" "light"   "neutral" "dark"    "light"   "neutral"
## [43] "neutral" "dark"    "neutral" "neutral" "light"   "neutral" "light"  
## [50] "light"   "light"   "dark"    "neutral" "neutral" "neutral" "dark"   
## [57] "neutral" "neutral" "neutral" "dark"    "dark"    "light"   "light"  
## [64] "neutral"
  1. for the edges:
edge_attr(sw_network_prequel_bare)
## $weight
##   [1]  2  1  1  1  2 10  2  3  2  1  3  1  1  1 12  2  1  2  3  9  1 46  2
##  [24]  3 41 15  1 22 32  1  4  3  2  8  1  1  1  2  5  9  3  2  2  1  1  1
##  [47]  2  2  8  5  6  1  4  1  1  8  1  1  1  1  3  1  2  1  2  2  2  2  2
##  [70]  2  4  1  2  1  1  3  2 12  2  1 10  1  1  1  7  7  7  3  9  2  3  1
##  [93]  2  2  2  2  1  2  1  1  1  2  1  3  3  2  1  1  2  1  3  3  1  1  1
## [116]  1  1  2  2  1  2  1  1  1  1  1  2  1  1  1  1  1  1  1  1  1  1  1
## [139]  1  1 15  1 10  5 22  1  2  1  3  1  2  1  1  1  1  2  1  1  2  4  1
## [162]  1  1  2  1  4  1  2  2  1  1  1  6  2  9  2  3  2  1  1 16  2  1  2
## [185]  6  2  1  8  2  2  1  1  2  9  5  1  1 26 12  3  1  1  1  1  5  1  1
## [208] 19  2  1  4  5  1 16 22  1  2  5  1  1  1  3  1  1  2  1 14  1  2  2
## [231]  8  2  1  1  2  6  3  2  3  1  1  3  2  1  1  2  1  1  2  2
  1. Create an igraph object for the original trilogy
sw_network_original_bare <- graph_from_data_frame(d=sw_e_original, vertices=sw_n_original, directed=F)
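How graph_from_data_frame() combines the two tables can be seen on a toy example (invented data, not the Star Wars files). The first two columns of the edge table are read as the endpoints of each edge, the first column of the vertex table is read as the node name, and all other columns become attributes.

```r
library(igraph)

# Toy edges and nodes (invented); node D has no edges at all
toy_edges <- data.frame(source = c("A", "A", "B"),
                        target = c("B", "C", "C"),
                        weight = c(5, 1, 2),
                        stringsAsFactors = FALSE)
toy_nodes <- data.frame(ID   = c("A", "B", "C", "D"),
                        SIDE = c("light", "dark", "neutral", "neutral"),
                        stringsAsFactors = FALSE)

toy_g <- graph_from_data_frame(d = toy_edges, vertices = toy_nodes, directed = FALSE)

vcount(toy_g)       # number of nodes, including the isolated node D
ecount(toy_g)       # number of edges
V(toy_g)$SIDE       # node attribute taken from the vertex table
E(toy_g)$weight     # edge attribute taken from the edge table
is_weighted(toy_g)  # TRUE, because the edges carry a weight attribute
```

Every name that occurs in the edge table must also occur in the vertex table; the reverse is not required, which is why D can be in the graph without any edges.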

4.4 Coloring nodes

4.4.1 Approach 1

Let’s add some additional information. Below is another example of how we can assign colors in a somewhat more efficient manner.

sw_network <- sw_network_prequel_bare

newColors <- V(sw_network)$SIDE

newColors <- replace(newColors, newColors=="light", "gold")
newColors <- replace(newColors, newColors=="dark", "blue")
newColors <- replace(newColors, newColors=="neutral", "lightblue")
newColors <- replace(newColors, newColors=="anakin", "gold")

V(sw_network)$color <- newColors
V(sw_network)$color
##  [1] "gold"      "gold"      "lightblue" "lightblue" "gold"     
##  [6] "gold"      "gold"      "lightblue" "gold"      "gold"     
## [11] "gold"      "lightblue" "gold"      "gold"      "blue"     
## [16] "blue"      "blue"      "gold"      "lightblue" "gold"     
## [21] "blue"      "gold"      "lightblue" "lightblue" "lightblue"
## [26] "gold"      "lightblue" "lightblue" "gold"      "lightblue"
## [31] "lightblue" "gold"      "gold"      "blue"      "gold"     
## [36] "lightblue" "lightblue" "gold"      "lightblue" "blue"     
## [41] "gold"      "lightblue" "lightblue" "blue"      "lightblue"
## [46] "lightblue" "gold"      "lightblue" "gold"      "gold"     
## [51] "gold"      "blue"      "lightblue" "lightblue" "lightblue"
## [56] "blue"      "lightblue" "lightblue" "lightblue" "blue"     
## [61] "blue"      "gold"      "gold"      "lightblue"
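As an alternative to the chain of replace() calls above, the same mapping can be written as a lookup in a named vector. This is a sketch in base R; the vector toy_side is invented and stands in for V(sw_network)$SIDE, and the name side_palette is hypothetical.

```r
# Toy values standing in for V(sw_network)$SIDE
toy_side <- c("light", "dark", "neutral", "anakin", "light")

# One named vector holds the whole mapping: names are SIDE values, values are colors
side_palette <- c(light = "gold", dark = "blue", neutral = "lightblue", anakin = "gold")

# Indexing by name looks up the color for each node; unname() drops the names
toy_colors <- unname(side_palette[toy_side])
toy_colors
```

A value of SIDE that is missing from the palette comes out as NA, which makes gaps in the mapping easy to spot.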

NB: As you have noticed, graphs do not always come out readable. To fix that, you can add parameters for width and height, as you can see in the code chunk parameters below ({r fig.height=15, fig.width=22}):

set.seed(1)
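# the size of node labels is set with vertex.label.cex (label.cex is not a plotting parameter)
plot(sw_network, vertex.size=5, vertex.label.cex=0.5)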

NB: with the following code you can save a PDF of your graph. You can change width= and height= to get a usable graph of your network.

pdf(file="practiceGraph_swNetwork.pdf", width=30, height=20)
set.seed(1)
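# the size of node labels is set with vertex.label.cex (label.cex is not a plotting parameter)
plot(sw_network, vertex.size=2, vertex.label.cex=0.5)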
dev.off()
## png 
##   2

4.4.2 Approach 2

We can use the code below to assign colors based on affiliations. However, since values are listed in an untidy manner, we may want to apply a different approach. Namely, we can color our nodes in a binary mode: if there is a certain affiliation in AFFILIATIONS, we assign color X; if not, we assign color Y.

In the example below, the ifelse() function checks whether a condition is true and returns one value if it is and another if it is not. str_detect() (from the stringr library) checks whether a given string (in the code below, “Separatists”) occurs in the AFFILIATIONS value.

aff <- data.frame(V(sw_network)$AFFILIATIONS)
aff <- cbind(aff, V(sw_network)$AFFILIATIONS)
colnames(aff) <- c("AFFILIATIONS", "COLORS")

library(stringr)
# do not forget that capital letters and small letters are different characters!
aff$COLORS <- ifelse(str_detect(aff$COLORS, "Separatists"), "red", "white")

V(sw_network)$color <-aff$COLORS

set.seed(1)
plot(sw_network, layout=layout_with_fr, vertex.size=5)

# now let's assign the same color to all nodes
V(sw_network)$color <- "orange"
  1. Generate graph with characters from the Outer Rim territories highlighted with a distinct color.
aff2 <- data.frame(V(sw_network)$AFFILIATIONS)
aff2 <- cbind(aff2, V(sw_network)$AFFILIATIONS)
colnames(aff2) <- c("AFFILIATIONS", "COLORS")

library(stringr)
aff2$COLORS <- ifelse(str_detect(aff2$COLORS, "Outer Rim"), "green", "white")

V(sw_network)$color <-aff2$COLORS

set.seed(2)
plot(sw_network, layout=layout_with_fr, vertex.size=5)
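The binary coloring used above depends on exact, case-sensitive matching. The following sketch shows the difference on three invented AFFILIATIONS values; regex(..., ignore_case = TRUE) is a stringr helper that makes the match case-insensitive.

```r
library(stringr)

# Toy AFFILIATIONS values (invented); note the lower-case spelling in the third one
toy_aff <- c("Old Republic, Senate",
             "Trade Federation, Separatists",
             "separatists, Techno Union")

# Exact match: capital and small letters are different characters
str_detect(toy_aff, "Separatists")

# Case-insensitive match
toy_match <- str_detect(toy_aff, regex("separatists", ignore_case = TRUE))

# Binary coloring: red if the affiliation occurs, white otherwise
toy_colors <- ifelse(toy_match, "red", "white")
toy_colors
```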

  1. Color nodes for the original trilogy and generate a graph. Explain how you color the nodes.
# renaming the dataset
sw_network_o <- sw_network_original_bare

#creating a new variable that holds the variable (column) "SIDE" of the vertices (nodes) of the igraph object
newColors <- V(sw_network_o)$SIDE

#creating new colors by replacing old ones with the "replace" function and saving them as a new variable
newColors <- replace(newColors, newColors=="light", "gold")
newColors <- replace(newColors, newColors=="dark", "blue")
newColors <- replace(newColors, newColors=="neutral", "lightblue")
newColors <- replace(newColors, newColors=="anakin", "gold")

#assigning the new colors to the igraph object color variable
V(sw_network_o)$color <- newColors
V(sw_network_o)$color
##  [1] "gold"      "gold"      "lightblue" "lightblue" "gold"     
##  [6] "lightblue" "lightblue" "lightblue" "gold"      "gold"     
## [11] "lightblue" "gold"      "gold"      "lightblue" "lightblue"
## [16] "gold"      "gold"      "lightblue" "gold"      "gold"     
## [21] "blue"      "blue"      "gold"      "gold"      "lightblue"
## [26] "gold"      "blue"      "gold"      "blue"      "blue"     
## [31] "blue"      "lightblue" "gold"      "gold"      "blue"     
## [36] "gold"      "gold"      "gold"

We can add more information to our igraph object to make the plot of our network more informative: for example, we can assign various sizes to nodes, based on their degree, and modify the thickness (width) of edges based on their weights. Last but not least, we can display the labels of all the nodes (we can also modify the size of the labels based on the degree of each node).
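A minimal sketch of these three tweaks on a toy graph (invented data). The scaling formulas are arbitrary choices for illustration, not recommended values; plot() picks up the vertex attributes size and label.cex and the edge attribute width automatically.

```r
library(igraph)

# Toy weighted edge list (invented for illustration)
toy_edges <- data.frame(source = c("A", "A", "A", "B"),
                        target = c("B", "C", "D", "C"),
                        weight = c(10, 1, 2, 5),
                        stringsAsFactors = FALSE)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

toy_deg <- degree(toy_g, mode = "all")

# Node size grows with degree (the constants are arbitrary)
V(toy_g)$size <- 5 + 3 * toy_deg

# Edge width grows with weight; log() keeps heavy edges from dominating the plot
E(toy_g)$width <- 1 + log(E(toy_g)$weight)

# Label size grows with degree as well
V(toy_g)$label.cex <- 0.5 + toy_deg / max(toy_deg)

set.seed(1)
plot(toy_g, layout = layout_with_fr)
```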

5 Analysis

We can run a number of analytical procedures on our graph, results of which will allow us to learn new things about our network data. We will cover only the major elements. For more, you can consult: https://kateto.net/networks-r-igraph (also, see References below).

5.1 Centrality Measures

To put it simply, the more connections a node has, the more central its place is (that does not really work for ego-networks though). Yet there is a variety of ways to calculate that. For example: 1) we can simply count the number of connections; 2) we can also factor in the weights of all connections; 3) we can also factor in the degrees of connected nodes, etc. Things get more complicated in directed graphs…
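The three options listed above roughly correspond to three igraph functions: degree(), strength() and eigen_centrality(). The sketch below compares them on a toy graph (invented data). Note that strength() and eigen_centrality() use the weight edge attribute by default when it is present; weights = NA switches the weights off.

```r
library(igraph)

# Toy weighted graph (invented): the edge B-C is much heavier than the others
toy_edges <- data.frame(source = c("A", "A", "B", "C"),
                        target = c("B", "C", "C", "D"),
                        weight = c(1, 1, 10, 1),
                        stringsAsFactors = FALSE)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

# 1) number of connections
toy_degree <- degree(toy_g, mode = "all")

# 2) sum of the weights of all connections
toy_strength <- strength(toy_g, mode = "all")

# 3) connections weighted by how well connected the neighbours are (weights ignored here)
toy_eigen <- eigen_centrality(toy_g, weights = NA)$vector

toy_degree
toy_strength
round(toy_eigen, 2)
```

A and B have the same degree, but B has a much higher strength because of its single heavy edge.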

5.1.1 Centrality / Node degree

Degree is the number of edges that a node has; in directed graphs nodes have in- and out-degrees, which are the numbers of incoming and outgoing edges, respectively. The function degree() has a mode argument, which can be set to in for in-degree, out for out-degree, and all or total for total degree.

deg <- degree(sw_network, mode="all")
sort(deg, decreasing = T)[1:15]
##      ANAKIN (DW)            PADME          OBI-WAN          QUI-GON 
##               40               34               32               27 
##          JAR JAR   PALPATINE (DS)      BAIL ORGANA      NUTE GUNRAY 
##               24               21               17               16 
##             YODA            C-3PO       MACE WINDU            R2-D2 
##               16               15               13               11 
## COUNT DOOKU (DT)   CAPTAIN PANAKA     KI-ADI-MUNDI 
##               10                9                9
  1. Who have the highest degrees in the original trilogy?
deg_o <- degree(sw_network_o, mode="all")
sort(deg_o, decreasing = T)[1:15]
##           LUKE          C-3PO           LEIA            HAN          R2-D2 
##             42             31             31             28             23 
##      CHEWBACCA    ANAKIN (DW)          LANDO        OBI-WAN          WEDGE 
##             22             20             17             11              9 
##          BIGGS     MON MOTHMA ADMIRAL ACKBAR      BOBA FETT     RED LEADER 
##              8              8              7              7              7
set.seed(1)
plot(sw_network,
     vertex.size=deg/5,
     vertex.label.cex=0.5,
     vertex.color="white",
     layout=layout_with_fr,
     vertex.label.color="black")

hist(deg, main="Distribution of Node Degrees")

  1. Generate graph for the original trilogy with nodes sized by their degrees
set.seed(1)
plot(sw_network_o,
     vertex.size=deg_o/5,
     vertex.label.cex=0.5,
     vertex.color="white",
     layout=layout_with_fr,
     vertex.label.color="black")

hist(deg_o, main="Distribution of Node Degrees")

# normalized: leveling centrality score by dividing by the theoretical maximum
degree <- degree(sw_network, mode="all")
sort(degree, decreasing = TRUE)
##          ANAKIN (DW)                PADME              OBI-WAN 
##                   40                   34                   32 
##              QUI-GON              JAR JAR       PALPATINE (DS) 
##                   27                   24                   21 
##          BAIL ORGANA          NUTE GUNRAY                 YODA 
##                   17                   16                   16 
##                C-3PO           MACE WINDU                R2-D2 
##                   15                   13                   11 
##     COUNT DOOKU (DT)       CAPTAIN PANAKA         KI-ADI-MUNDI 
##                   10                    9                    9 
##              KITSTER           SIO BIBBLE                 SHMI 
##                    9                    8                    8 
##                JABBA      SENATOR ASK AAK             RIC OLIE 
##                    7                    7                    7 
##            BOSS NASS               POGGLE              SEBULBA 
##                    6                    6                    6 
##              SUN RIT                WATTO                 BERU 
##                    6                    6                    5 
##        CAPTAIN TYPHO               CLIEGG                 OWEN 
##                    5                    5                    5 
##                 RABE                 WALD                TC-14 
##                    5                    5                    5 
## CLONE COMMANDER CODY           DARTH MAUL             FANG ZAR 
##                    4                    4                    4 
##         GENERAL CEEL         GIDDEAN DANU           JANGO FETT 
##                    4                    4                    4 
##                JOBAL           MON MOTHMA                 RUNE 
##                    4                    4                    4 
##                RUWEE              VALORUM                 SOLA 
##                    4                    4                    4 
##              TAUN WE            BOBA FETT          BRAVO THREE 
##                    4                    3                    3 
##            BRAVO TWO     CAPTAIN ANTILLES               DOFINE 
##                    3                    3                    3 
##            FODE/BEED     GENERAL GRIEVOUS               GREEDO 
##                    3                    3                    3 
##                 JIRA         ORN FREE TAA              TEY HOW 
##                    3                    3                    3 
##              LAMA SU                 PK-4             ODD BALL 
##                    2                    2                    2 
## CLONE COMMANDER GREE              TARPALS             PLO KOON 
##                    1                    1                    1 
##           TION MEDON 
##                    1
centr_degree(sw_network, mode="all", normalized=T)
## $res
##  [1] 40 17  5  3  6  3  3 15  3  9  5  5  4  1 10  4  3  4  3  4  3  4  3
## [24]  7  4 24  3  4  9  9  2 13  4 16 32  3  5 34  2  6 27 11  5  4  4  6
## [47]  7  8  4 16  2 21  7  8  4  6  5  6  4  5  3  1  1  1
## 
## $centralization
## [1] 0.5109127
## 
## $theoretical_max
## [1] 4032
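To see what the normalization does, here is a sketch on a toy star graph (one hub connected to four leaves). It shows two different things: degree(..., normalized=TRUE) rescales each node's degree by n-1, while centr_degree() returns one graph-level score, which can be recomputed by hand from its own $res and $theoretical_max.

```r
library(igraph)

# Toy graph: node 1 is the hub, nodes 2 to 5 are leaves
toy_star <- make_star(5, mode = "undirected")

degree(toy_star)                                # raw degrees: 4 1 1 1 1
toy_norm <- degree(toy_star, normalized = TRUE) # each degree divided by n - 1
toy_norm

# Graph-level centralization
toy_cd <- centr_degree(toy_star, mode = "all", normalized = TRUE)

# The same score by hand: summed distance from the largest degree,
# divided by the theoretical maximum of that sum
toy_manual <- sum(max(toy_cd$res) - toy_cd$res) / toy_cd$theoretical_max

toy_cd$centralization
toy_manual
```

The same relation holds for the prequel output above: the degrees sum to 500 (250 edges), so (64 * 40 - 500) / 4032 gives the reported 0.5109127.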

5.1.2 Eigenvector Centrality

Eigenvector centrality takes into consideration the degree of connected nodes. In other words, while the degree centrality of two nodes may be the same (say, they are each connected to three other nodes), their eigenvector centrality will be different if one node is connected to nodes with degree 3, while the other is connected to nodes with degree 1; the first node will have a higher eigenvector centrality value.
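The comparison described above can be checked on a toy graph (invented data). Nodes A and B both have degree 3, but B sits inside a tightly connected group (B, b1, b2, b3), while two of A's three neighbours (a1, a2) have no other connections.

```r
library(igraph)

# Toy graph: B, b1, b2, b3 are all connected to each other;
# A is connected to two leaves (a1, a2) and to b1
toy_edges <- data.frame(
  from = c("B", "B", "B", "b1", "b1", "b2", "A", "A", "A"),
  to   = c("b1", "b2", "b3", "b2", "b3", "b3", "a1", "a2", "b1"),
  stringsAsFactors = FALSE
)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

toy_deg <- degree(toy_g)
toy_eig <- eigen_centrality(toy_g)$vector

toy_deg[c("A", "B")]            # the same degree
round(toy_eig[c("A", "B")], 2)  # B scores higher than A
```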

sw_network_eigen <- sw_network

eigenCent <- eigen_centrality(sw_network_eigen)$vector
sort(eigenCent,decreasing=TRUE)[1:10]
##    ANAKIN (DW)        OBI-WAN          PADME        QUI-GON          R2-D2 
##      1.0000000      0.8047439      0.7127653      0.6533549      0.5916439 
##        JAR JAR           YODA PALPATINE (DS)          C-3PO     MACE WINDU 
##      0.4164547      0.2833328      0.2544541      0.2445762      0.2126134
plot(sort(eigenCent,decreasing=TRUE))
points(x=7, y=sort(eigenCent,decreasing=TRUE)[7], col="red", pch=19)
points(x=15, y=sort(eigenCent,decreasing=TRUE)[15], col="red", pch=19)

# ANSWER: ANAKIN
  1. Who has the highest eigenvector centrality in the original trilogy?
sw_network_eigen <- sw_network_o

eigenCent_o <- eigen_centrality(sw_network_eigen)$vector
sort(eigenCent_o,decreasing=TRUE)[1:10]
##         HAN        LEIA       C-3PO   CHEWBACCA        LUKE       R2-D2 
##  1.00000000  0.87574951  0.86437448  0.82721898  0.70977616  0.56121865 
##       LANDO     OBI-WAN ANAKIN (DW)     RIEEKAN 
##  0.24853920  0.19178238  0.06438129  0.04211306
plot(sort(eigenCent_o,decreasing=TRUE))
points(x=7, y=sort(eigenCent_o,decreasing=TRUE)[7], col="red", pch=19)
points(x=15, y=sort(eigenCent_o,decreasing=TRUE)[15], col="red", pch=19)

# ANSWER: HAN

NB: Here is another way to color nodes:

# reset to the prequel network, so that the graph matches the values in eigenCent
sw_network_eigen <- sw_network

bins <- unique(quantile(eigenCent, seq(0,1,length.out=50)))
vals <- cut(eigenCent, bins, labels=FALSE, include.lowest=TRUE)
colorVals <- rev(heat.colors(length(bins)))[vals]
V(sw_network_eigen)$color <- colorVals
set.seed(1)
plot(sw_network_eigen,
     vertex.label=NA,
     layout=layout_with_fr,
     vertex.size=5,
     main="Eigenvector")
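The binning logic used above (quantile(), then cut(), then a color lookup) can be followed step by step on a handful of invented scores. This sketch uses base R only; the numbers are made up and the number of bins is reduced to keep the output readable.

```r
# Toy centrality scores (invented for illustration)
toy_scores <- c(0.05, 0.1, 0.1, 0.5, 1)

# 1) break points at evenly spaced quantiles; unique() removes repeated break points
toy_bins <- unique(quantile(toy_scores, seq(0, 1, length.out = 4)))

# 2) for every score, the number of the bin it falls into
toy_vals <- cut(toy_scores, toy_bins, labels = FALSE, include.lowest = TRUE)

# 3) one color per bin: rev() puts the pale colors first, so low scores are pale
toy_colors <- rev(heat.colors(length(toy_bins)))[toy_vals]

data.frame(score = toy_scores, bin = toy_vals, color = toy_colors)
```

The result has exactly one color per score, so it can only be assigned to a graph with the same number of nodes, in the same order, as the vector of scores.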

5.2 Betweenness centrality

The vertex and edge betweenness are (roughly) defined by the number of geodesics (shortest paths) going through a vertex or an edge.
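A toy illustration of this definition (invented data): two triangles, A-B-C and D-E-F, are joined only through the node G. Every shortest path between the two triangles has to pass through G, so G gets the highest betweenness although it has only two connections.

```r
library(igraph)

# Two triangles (A, B, C) and (D, E, F) linked through the bridge node G
toy_edges <- data.frame(
  from = c("A", "A", "B", "C", "G", "D", "D", "E"),
  to   = c("B", "C", "C", "G", "D", "E", "F", "F"),
  stringsAsFactors = FALSE
)
toy_g <- graph_from_data_frame(toy_edges, directed = FALSE)

toy_btw <- betweenness(toy_g, directed = FALSE)
sort(toy_btw, decreasing = TRUE)

degree(toy_g)[["G"]]  # only two connections
```

G lies on all 9 shortest paths between the two triangles (3 nodes on each side), while C and D each lie on 8.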

betweenCent <- betweenness(sw_network, directed=F)
cor(betweenCent,eigenCent)
## [1] 0.7780195
sort(betweenCent,decreasing=TRUE)[1:10]
##  ANAKIN (DW)      OBI-WAN        PADME      JAR JAR  NUTE GUNRAY 
##     416.9652     396.8997     328.3160     315.4231     238.2202 
##         YODA      QUI-GON   JANGO FETT        C-3PO GENERAL CEEL 
##     180.1230     163.2874     122.2740     121.7576     121.4692
plot(sort(betweenCent,decreasing=TRUE))
points(x=11, y=sort(betweenCent,decreasing=TRUE)[11], col="red", pch=19)
points(x=20, y=sort(betweenCent,decreasing=TRUE)[20], col="red", pch=19)

sw_network_bw <- sw_network

betweenCent <- betweenness(sw_network_bw, directed=F)
bins <- unique(quantile(betweenCent, seq(0,1,length.out=50)))
vals <- cut(betweenCent, bins, labels=FALSE, include.lowest=TRUE)
colorVals <- rev(heat.colors(length(bins)))[vals]
V(sw_network_bw)$color <- colorVals

set.seed(1)
plot(sw_network_bw,
     vertex.label=NA,
     layout=layout_with_fr,
     vertex.size=5,
     main="Betweenness")

Nodes that have relatively low eigenvector centrality and relatively high betweenness centrality are often called gate-keepers: connecting [rather] disjoint parts of the graph, these nodes function purely as links rather than as central hubs. (Keep in mind that this is not a strictly binary category: whether a node counts as a gate-keeper often depends on the cut-off values.)

betweenCent <- betweenness(sw_network)
eigenCent <- eigen_centrality(sw_network)$vector  # evcent() is the older name of this function
colorVals <- rep("white", length(betweenCent))
# the cut-off values are taken experimentally from the graphs above. Can you find them?
colorVals[which(
  eigenCent <= sort(eigenCent,decreasing=TRUE)[7] &
    betweenCent > sort(betweenCent,decreasing=TRUE)[11])
          ] <- "red" # vertices which connect disjoint parts of the graph
V(sw_network)$color <- colorVals

set.seed(1)
plot.igraph(sw_network,
            layout=layout_with_fr,
            #vertex.label=NA,
            vertex.size=5)

  1. Who are the gate-keepers in the prequel?

General Ceel, Nute Gunray, C-3PO, Yoda and Jango Fett

  1. Who are the gate-keepers in the original trilogy?
betweenCent <- betweenness(sw_network_o)
eigenCent <- eigen_centrality(sw_network_o)$vector  # evcent() is the older name of this function
colorVals <- rep("white", length(betweenCent))
# the cut-off values are taken experimentally from the graphs above. Can you find them?
colorVals[which(
  eigenCent <= sort(eigenCent,decreasing=TRUE)[7] &
    betweenCent > sort(betweenCent,decreasing=TRUE)[11])
          ] <- "red" # vertices which connect disjoint parts of the graph
V(sw_network_o)$color <- colorVals

set.seed(1)
plot.igraph(sw_network_o,
            layout=layout_with_fr,
            #vertex.label=NA,
            vertex.size=5)

# ANSWER: Biggs, Wedge, Admiral Ackbar, Mon Mothma and Anakin (DW)

5.3 Cliques

A clique is a set of nodes in which every node is directly connected to every other node (a complete subgraph). cliques() finds all cliques in a given network; the parameters min= and max= set the minimum and maximum number of nodes in the cliques to find. For more details, try ?cliques.

swCliques <- cliques(sw_network, min=9)
swCliques
## [[1]]
## + 9/64 vertices, named, from bf8b291:
## [1] ANAKIN (DW)    BAIL ORGANA    JAR JAR        MACE WINDU    
## [5] OBI-WAN        PADME          QUI-GON        YODA          
## [9] PALPATINE (DS)
## 
## [[2]]
## + 9/64 vertices, named, from bf8b291:
## [1] ANAKIN (DW)    BAIL ORGANA    C-3PO          OBI-WAN       
## [5] PADME          QUI-GON        R2-D2          YODA          
## [9] PALPATINE (DS)

The following code extracts only the largest cliques (the result is the same as above):

lCliques <- largest.cliques(sw_network)
lCliques
## [[1]]
## + 9/64 vertices, named, from bf8b291:
## [1] ANAKIN (DW)    PADME          QUI-GON        R2-D2         
## [5] C-3PO          BAIL ORGANA    OBI-WAN        YODA          
## [9] PALPATINE (DS)
## 
## [[2]]
## + 9/64 vertices, named, from bf8b291:
## [1] ANAKIN (DW)    PADME          QUI-GON        JAR JAR       
## [5] OBI-WAN        PALPATINE (DS) BAIL ORGANA    MACE WINDU    
## [9] YODA

There are two largest cliques, which we can plot like this:

c1 <- lCliques[[1]]
c2 <- lCliques[[2]]

# Extract each clique as a subgraph of the full network
g1 <- induced_subgraph(sw_network, c1)
g2 <- induced_subgraph(sw_network, c2)
par(mfrow=c(1,2))

set.seed(78)
plot(g1, layout=layout_nicely,
     #vertex.label=NA,
     vertex.color="orange",      
     vertex.size=5)
set.seed(78)
plot(g2, layout=layout_nicely,
     #vertex.label=NA,
     vertex.color="orange",
     vertex.size=5)

For more details on cliques, check ?cliques.
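
The difference between all cliques and the largest cliques can be checked on a toy graph. This is a minimal sketch with made-up nodes: a triangle a, b, c with one extra node d attached to c. It uses largest_cliques(), the current name of largest.cliques(), and as_ids() to turn a vertex sequence into plain names.

```r
library(igraph)

# A triangle (a, b, c) plus a pendant node d attached to c (toy data)
toy <- graph_from_literal(a-b, a-c, b-c, c-d)

# Every single node and every edge is a clique too, so the full list is long
all_cliques <- cliques(toy)

# min= keeps only cliques with at least that many nodes
big_cliques <- cliques(toy, min = 3)

# The largest cliques are those with the maximum number of nodes
largest <- largest_cliques(toy)
largest_names <- as_ids(largest[[1]])

length(all_cliques)
largest_names
```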

  1. What are the largest cliques in the original trilogy?
lCliques <- largest.cliques(sw_network_o)
lCliques
## [[1]]
## + 8/38 vertices, named, from bf9bc69:
## [1] LUKE       C-3PO      LEIA       HAN        CHEWBACCA  LANDO     
## [7] MON MOTHMA R2-D2
c1 <- lCliques[[1]]

# Extract the clique as a subgraph of the original-trilogy network
g1 <- induced_subgraph(sw_network_o, c1)
set.seed(78)
plot(g1, layout=layout_nicely,
     #vertex.label=NA,
     vertex.color="orange",      
     vertex.size=5)

5.4 Communities

Communities are clusters of nodes that are densely connected with each other, while connections between clusters are sparser than connections within them. Identifying communities is tricky because there are plenty of algorithms for this purpose and they are all based on different assumptions about networks.

igraph has several community detection algorithms. These functions try to find communities, where a community is a set of nodes with many edges inside the community and few edges leading outside it (i.e. between the community itself and the rest of the graph). A nice summary of these algorithms can be found at https://stackoverflow.com/questions/9471906/; you can also read about them using the help function, like ?cluster_walktrap.

  • cluster_walktrap()
  • cluster_spinglass()
  • cluster_leading_eigen()
  • cluster_edge_betweenness()
  • cluster_fast_greedy()
  • cluster_louvain()
  • cluster_label_prop()
  • cluster_infomap()

(Note: cluster_leading_eigen() does not seem to be suitable for our Star Wars network—it throws an error, so we will skip it.)

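Generally, the graph/network must be undirected for running such algorithms (same for cliques). Note that simplify() only removes loops and multiple edges and does not change directedness; to make a graph undirected, use as.undirected() (renamed as_undirected() in recent igraph releases), or re-create the igraph object with the option directed=FALSE.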
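
As a minimal sketch of such a conversion (toy data, made-up weights): the directed graph below has edges in both directions between a and b. as.undirected() with mode="collapse" merges them into one undirected edge, and edge.attr.comb tells igraph to sum their weights. Recent igraph releases may print a deprecation notice and suggest as_undirected() instead; the arguments are the same.

```r
library(igraph)

# Toy directed edges: a->b and b->a are reciprocal, b->c is one-way
toy_edges <- data.frame(from   = c("a", "b", "b"),
                        to     = c("b", "a", "c"),
                        weight = c(2, 3, 1))
toy_directed <- graph_from_data_frame(toy_edges, directed = TRUE)

# Collapse reciprocal edges into one undirected edge and add up their weights
toy_undirected <- as.undirected(toy_directed,
                                mode = "collapse",
                                edge.attr.comb = list(weight = "sum"))

is_directed(toy_undirected)
E(toy_undirected)$weight
```

Summing is only one choice; depending on what the weights mean, "max" or "mean" may be more appropriate.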

Let’s try them all. We will use set.seed(1) and layout_with_dh() for each graph, so that every plot has the same shape and the outputs are much easier to compare.

test <- sw_network

cluster_walktrap = cluster_walktrap(test)
cluster_spinglass = cluster_spinglass(test)
#cluster_leading_eigen = cluster_leading_eigen(test) # this a
cluster_edge_betweenness = cluster_edge_betweenness(test)
## Warning in cluster_edge_betweenness(test): At community.c:460 :Membership
## vector will be selected based on the lowest modularity score.
## Warning in cluster_edge_betweenness(test): At community.c:467 :Modularity
## calculation with weighted edge betweenness community detection might not
## make sense -- modularity treats edge weights as similarities while edge
## betwenness treats them as distances
cluster_fast_greedy = cluster_fast_greedy(test)
cluster_louvain = cluster_louvain(test)
cluster_label_prop = cluster_label_prop(test)
cluster_infomap <- cluster_infomap(test) # for directed graphs

These functions generate objects of the class communities; we can extract information on communities with a line like cluster_walktrap$membership. Such vectors can be attached to our nodes table, so that we can collect all analytical information in one table and reuse it later.
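
A minimal sketch of collecting such information in one table, using a toy graph rather than the Star Wars network (all names below are made up): two groups of four fully connected nodes are joined by a single edge, and the walktrap membership is stored next to each node's name and degree.

```r
library(igraph)

# Two fully connected groups joined by one edge (toy data)
clique_a <- t(combn(c("a1", "a2", "a3", "a4"), 2))
clique_b <- t(combn(c("b1", "b2", "b3", "b4"), 2))
toy <- graph_from_edgelist(rbind(clique_a, clique_b, c("a1", "b1")),
                           directed = FALSE)

toy_walktrap <- cluster_walktrap(toy)

# membership() returns one community id per node, in the order of V(toy)
toy_nodes <- data.frame(
  name     = V(toy)$name,
  degree   = degree(toy),
  walktrap = as.integer(membership(toy_walktrap)),
  stringsAsFactors = FALSE
)

toy_nodes
```

Results of further algorithms can be added as extra columns in the same way, which makes it easy to compare how different algorithms group the same nodes.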

Now we can generate graphs of the communities identified with all these algorithms. In plot() we first pass a communities object, then the igraph object, then everything else. NB: deparse(substitute(cluster_infomap)) converts the name of the variable into a string so that we can print it on the graph.

par(mfrow=c(1,2), mar=c(1,1,1,1))

set.seed(1)
plot(cluster_walktrap, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_walktrap)), cex.main=2)

set.seed(1)
plot(cluster_spinglass, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_spinglass)), cex.main=2)

set.seed(1)
plot(cluster_edge_betweenness, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_edge_betweenness)), cex.main=2)

set.seed(1)
plot(cluster_fast_greedy, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_fast_greedy)), cex.main=2)

set.seed(1)
plot(cluster_louvain, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_louvain)), cex.main=2)

set.seed(1)
plot(cluster_label_prop, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_label_prop)), cex.main=2)

set.seed(1)
plot(cluster_infomap, test, layout=layout_with_dh, vertex.label=NA, vertex.size=5)
title(deparse(substitute(cluster_infomap)), cex.main=2)
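
The deparse(substitute(...)) idiom used for the titles above is most useful inside a function, where it recovers the name of whatever variable the caller passed in. A minimal sketch in base R follows; label_of() and the two toy vectors are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: returns the name of the variable passed as an argument
label_of <- function(x) deparse(substitute(x))

toy_membership <- c(1, 1, 2)
label_of(toy_membership)

# An alternative that avoids the idiom: keep results in a named list
# and take the titles from names()
toy_results <- list(walktrap = c(1, 1, 2), louvain = c(1, 2, 2))
toy_titles <- names(toy_results)
toy_titles
```

With a named list, the seven nearly identical plotting calls could be written as one loop over the names of the list.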

  1. Generate communities for the original trilogy. What are they, who are their members? Feel free to share any thoughts on this issue.
test2 <- sw_network_o

cluster_walktrap2 = cluster_walktrap(test2)
cluster_spinglass2 = cluster_spinglass(test2)
#cluster_leading_eigen2 = cluster_leading_eigen(test2) # this a
cluster_edge_betweenness2 = cluster_edge_betweenness(test2)
## Warning in cluster_edge_betweenness(test2): At community.c:460 :Membership
## vector will be selected based on the lowest modularity score.
## Warning in cluster_edge_betweenness(test2): At community.c:467 :Modularity
## calculation with weighted edge betweenness community detection might not
## make sense -- modularity treats edge weights as similarities while edge
## betwenness treats them as distances
#cluster_fast_greedy2 = cluster_fast_greedy(test2)
cluster_louvain2 = cluster_louvain(test2)
cluster_label_prop2 = cluster_label_prop(test2)
cluster_infomap2 <- cluster_infomap(test2) # for directed graphs
par(mfrow=c(1,2), mar=c(1,1,1,1))

set.seed(1)
plot(cluster_walktrap2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_walktrap)), cex.main=2)

set.seed(1)
plot(cluster_spinglass2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_spinglass)), cex.main=2)

set.seed(1)
plot(cluster_edge_betweenness2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_edge_betweenness)), cex.main=2)

set.seed(1)
plot(cluster_louvain2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_louvain)), cex.main=2)

set.seed(1)
plot(cluster_label_prop2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_label_prop)), cex.main=2)

set.seed(1)
plot(cluster_infomap2, test2, layout=layout_with_dh, vertex.size=5)
title(deparse(substitute(cluster_infomap)), cex.main=2)

# Alternative visualizations

5.5 ggraph

The ggraph library is a part of the ggplot family and offers comparable options for plotting graphs, relying on the concept of the grammar of graphics.

library(ggraph)
library(ggrepel)
library(ggalt)

set.seed(786)

sw_network_ggraph <- sw_network
V(sw_network_ggraph)$cluster_louvain <- as.character(cluster_louvain$membership)

vDegree <- degree(sw_network_ggraph, mode="all")
V(sw_network_ggraph)$degree <- vDegree

swNetworkPlot <- ggraph(sw_network_ggraph, 'igraph', algorithm = 'with_fr') +
  geom_encircle(s_shape=.1, expand=0.01, alpha=.25, col="black", aes(x=x, y=y, group=cluster_louvain, fill=cluster_louvain))+
  geom_edge_link(aes(alpha=weight), width=0.5) +
  geom_node_point(aes(color=cluster_louvain, size=degree), alpha=1) +
  geom_node_label(aes(label=name), color="black", size=3, repel=TRUE, alpha=0.75) +
  #ggforce::theme_no_axes() +
  theme_graph()+
  scale_size_continuous(range=c(0.1,10), limits=c(1,max(V(sw_network_ggraph)$degree)))

swNetworkPlot
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database

#ggsave(file=paste0("practiceGraph_swNetwork_TEST.png"),plot=swNetworkPlot,dpi=600,width=15,height=9)
  1. Generate a similar graph for the original trilogy, but use a different layout:
set.seed(786)

sw_network_ggraph2 <- sw_network_o
# Use the communities computed for the original trilogy (not the prequel ones)
V(sw_network_ggraph2)$cluster_walktrap2 <- as.character(cluster_walktrap2$membership)
# Degree must be calculated on the same graph it is attached to
vDegree <- degree(sw_network_ggraph2, mode="all")
V(sw_network_ggraph2)$degree <- vDegree
# 'with_kk' (Kamada-Kawai) is a different layout, as the exercise asks
swNetworkPlot2 <- ggraph(sw_network_ggraph2, 'igraph', algorithm = 'with_kk') +
  geom_encircle(s_shape=.1, expand=0.01, alpha=.25, col="black", aes(x=x, y=y, group=cluster_walktrap2, fill=cluster_walktrap2))+
  geom_edge_link(aes(alpha=weight), width=0.5) +
  geom_node_point(aes(color=cluster_walktrap2, size=degree), alpha=1) +
  geom_node_label(aes(label=name), color="black", size=3, repel=TRUE, alpha=0.75) +
  #ggforce::theme_no_axes() +
  theme_graph()+
  scale_size_continuous(range=c(0.1,10), limits=c(1,max(V(sw_network_ggraph2)$degree)))

swNetworkPlot2
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database
#ggsave(file=paste0("practiceGraph_swNetwork_TEST.png"),plot=swNetworkPlot,dpi=600,width=15,height=9)

Just for reference, these are the layouts from igraph that can be used as values for algorithm= (e.g. algorithm="with_kk").

 [1] "as_bipartite"  "as_star"       "as_tree"       "components"    "in_circle"    
 [6] "nicely"        "on_grid"       "on_sphere"     "randomly"      "with_dh"      
[11] "with_drl"      "with_fr"       "with_gem"      "with_graphopt" "with_kk"      
[16] "with_lgl"      "with_mds"      "with_sugiyama"
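
A list like this can be approximated directly from the installed igraph package. The sketch below is an assumption about how such a list might be produced, not necessarily how the one above was made: it collects all functions whose names start with layout_ and strips the prefix. The exact result depends on the igraph version, and a few helper functions that are not layouts in the narrow sense may also show up.

```r
library(igraph)

# All exported igraph functions whose names start with "layout_"
layout_functions <- grep("^layout_", ls("package:igraph"), value = TRUE)

# Strip the prefix to get the short names expected by algorithm=
layout_names <- sub("^layout_", "", layout_functions)

# Drop the empty string left over from the function called layout_()
layout_names <- layout_names[nzchar(layout_names)]

layout_names
```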

ggraph also has layouts of its own: layout_igraph_auto, layout_igraph_circlepack, layout_igraph_dendrogram, layout_igraph_hive, layout_igraph_linear, layout_igraph_manual, layout_igraph_partition. See https://www.rdocumentation.org/packages/ggraph/versions/1.0.2/.

Here is a nice example of an arc diagram, which uses ggraph’s own linear layout:

swNetworkPlot <- ggraph(sw_network_ggraph, layout="linear") + 
  geom_edge_arc(aes(width = weight), alpha = 0.8) + 
  scale_edge_width(range = c(0.2, 4)) +
  geom_node_text(aes(label = name), angle=90, size=3, hjust=0, nudge_x=0.0, nudge_y=-7) +
  labs(edge_width = "Interactions") +
  theme_graph()

swNetworkPlot
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database

#ggsave(file=paste0("practiceGraph_swNetwork_TEST.png"),plot=swNetworkPlot,dpi=600,width=20,height=7)

The same arc diagram, but plotted differently (added coord_flip()), and with some more improvements:

yNudge = 8

swNetworkPlot <- ggraph(sw_network_ggraph, layout="linear") + 
  geom_edge_arc(aes(width = weight, y=y+yNudge, yend=yend+yNudge), alpha = 0.8) + 
  geom_node_point(aes(color=cluster_louvain, size=degree)) +
  #geom_node_point(aes(size=degree, y=y+yNudge-1)) +
  #geom_node_point(aes(size=degree, y=y-yNudge), color="white") +
  geom_node_text(aes(label = name, y=y+yNudge-1), size=3, hjust=1, nudge_y = -1) +
  scale_edge_width(range = c(0.2, 4)) +
  coord_flip()+
  labs(edge_width = "Interactions") +
  theme_graph()

swNetworkPlot
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database

#ggsave(file=paste0("practiceGraph_swNetwork_TEST.png"),plot=swNetworkPlot,dpi=600,width=10,height=17)
  1. Analyze the code for the graph above: you can comment/uncomment lines of code to check what they are adding or modifying. Explain each line of code below.

  • geom_edge_arc: adds the interactions (arcs) between the nodes;
  • geom_node_point: adds the node points, which make it possible to visualize degrees and clusters;
  • geom_node_text: adds the labels for the nodes;
  • scale_edge_width: scales the width of the interaction lines to a custom range for better visualization;
  • coord_flip: flips the graph to a vertical view instead of a horizontal one;
  • labs and theme_graph: the last two functions set the legend title and choose the theme of the plot.

  1. Generate a similar graph for the original trilogy. Add whatever modifications you consider appropriate and valuable.
yNudge = 8

swNetworkPlot2 <- ggraph(sw_network_ggraph2, layout="linear") + 
  geom_edge_arc(aes(width = weight, y=y+yNudge, yend=yend+yNudge), alpha = 0.8) + 
  #geom_node_point(aes(color=cluster_louvain, size=degree)) +
  geom_node_point(aes(size=degree, y=y+yNudge-1)) +
  geom_node_point(aes(size=degree, y=y-yNudge), color="white") +
  geom_node_text(aes(label = name, y=y+yNudge-1), size=3, hjust=1, nudge_y = -1) +
  scale_edge_width(range = c(0.2, 4)) +
  coord_flip()+
  labs(edge_width = "Interactions") +
  theme_graph()

swNetworkPlot2
## Warning in grid.Call(C_textBounds, as.graphicsAnnot(x$label), x$x, x$y, :
## font family not found in Windows font database

#ggsave(file=paste0("practiceGraph_swNetwork_TEST.png"),plot=swNetworkPlot,dpi=600,width=10,height=17)

For more on layouts, see the website of the library’s creator, where you can also find examples of how to work with nodes and edges.

5.6 visNetwork: interactive graphs

Using the visNetwork library, we can build an interactive network right from our igraph object. In the example below, which is generated with just one line of code (!), you can zoom in and out, pan around, and click on nodes to see what other nodes they are connected to.

library(visNetwork)
visIgraph(sw_network_prequel_bare)

visNetwork also offers control over most parameters. Here is another example. First, we need to convert our igraph object into the visNetwork data structure, which toVisNetworkData() returns as a list with a nodes table and an edges table. Then we can add different network parameters and create our network as an interactive plot.

NB: if your network is large, it may take a lot of time to generate a graph. You can use the following parameters to speed up the graph generation (in the code below these lines are commented out):

  • visPhysics(stabilization = FALSE) : turns off real-time stabilization of nodes in the graph;
  • visEdges(smooth = FALSE) : uses straight edges instead of curves;
  • visIgraphLayout(layout = "layout_in_circle") : uses igraph to pregenerate the layout.
data <- toVisNetworkData(sw_network_ggraph)
nodes <- data[[1]]
edges <- data[[2]]

library(RColorBrewer)
#nodes$color <- brewer.pal(12, "Set3")[as.factor(nodes$cluster_louvain)]
nodes$color <- brewer.pal(12, "Set3")[as.factor(nodes$SIDE)]

nodes$shape <- "dot" 
nodes$shadow <- TRUE # Nodes will drop shadow
nodes$title <- nodes$label # Text on click
nodes$size <- round(nodes$degree/2, 3) # Node size
nodes$borderWidth <- 2 # Node border width
nodes$color.border <- "black"

edges$width <- round(edges$weight/2,3)

set.seed(1)
visNetwork(nodes, edges, width="100%", height="750px") %>%
  #visPhysics(stabilization = FALSE) %>% 
  #visEdges(smooth = FALSE) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = TRUE,
             selectedBy = "cluster_louvain",
             nodesIdSelection = TRUE)
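
To make the data structure less abstract, here is a minimal sketch of the two tables that visNetwork works with, built by hand with made-up nodes and weights. The column names id and label (nodes) and from and to (edges) are the ones visNetwork expects; everything else in the example is an illustrative assumption. The plotting call is guarded so that the table-building part runs even where visNetwork is not installed.

```r
# Nodes table: one row per node, with a unique id and a display label (toy data)
toy_nodes <- data.frame(id    = c("A", "B", "C"),
                        label = c("Node A", "Node B", "Node C"),
                        stringsAsFactors = FALSE)

# Edges table: one row per edge, referring to node ids (made-up weights)
toy_edges <- data.frame(from   = c("A", "A"),
                        to     = c("B", "C"),
                        weight = c(4, 2),
                        stringsAsFactors = FALSE)

# Visual parameters are simply extra columns, as in the code above
toy_nodes$shape <- "dot"
toy_edges$width <- toy_edges$weight / 2

# Build the interactive widget only if visNetwork is available
if (requireNamespace("visNetwork", quietly = TRUE)) {
  toy_widget <- visNetwork::visNetwork(toy_nodes, toy_edges)
}
```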

For more details on visNetwork: https://datastorm-open.github.io/visNetwork/. This example is based on: https://wesslen.github.io/text%20mining/topic-networks/.

  1. Generate an interactive network for the original trilogy. Provide a short description of your visualization.
data <- toVisNetworkData(sw_network_ggraph2)
nodes <- data[[1]]
edges <- data[[2]]

library(RColorBrewer)
#nodes$color <- brewer.pal(12, "Set3")[as.factor(nodes$cluster_louvain)]
nodes$color <- brewer.pal(12, "Set3")[as.factor(nodes$SIDE)]

nodes$shape <- "dot" 
nodes$shadow <- TRUE # Nodes will drop shadow
nodes$title <- nodes$label # Text on click
nodes$size <- round(nodes$degree/2, 3) # Node size
nodes$borderWidth <- 2 # Node border width
nodes$color.border <- "black"

edges$width <- round(edges$weight/2,3)

set.seed(1)
visNetwork(nodes, edges, width="100%", height="750px") %>%
  #visPhysics(stabilization = FALSE) %>% 
  #visEdges(smooth = FALSE) %>%
  visIgraphLayout(layout = "layout_with_fr") %>%
  visOptions(highlightNearest = TRUE,
             selectedBy = "cluster_walktrap2", # the community attribute stored on this graph
             nodesIdSelection = TRUE)

6 Reference: parameters for plotting with igraph

NODES
vertex.color          Node color
vertex.frame.color    Node border color
vertex.shape          One of “none”, “circle”, “square”, “csquare”, “rectangle”, “crectangle”, “vrectangle”, “pie”, “raster”, or “sphere”
vertex.size           Size of the node (default is 15)
vertex.size2          The second size of the node (e.g. for a rectangle)
vertex.label          Character vector used to label the nodes
vertex.label.family   Font family of the label (e.g. “Times”, “Helvetica”)
vertex.label.font     Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol
vertex.label.cex      Font size (multiplication factor, device-dependent)
vertex.label.dist     Distance between the label and the vertex
vertex.label.degree   The position of the label in relation to the vertex, where 0 is right, “pi” is left, “pi/2” is below, and “-pi/2” is above

EDGES
edge.color            Edge color
edge.width            Edge width, defaults to 1
edge.arrow.size       Arrow size, defaults to 1
edge.arrow.width      Arrow width, defaults to 1
edge.lty              Line type, could be 0 or “blank”, 1 or “solid”, 2 or “dashed”, 3 or “dotted”, 4 or “dotdash”, 5 or “longdash”, 6 or “twodash”
edge.label            Character vector used to label edges
edge.label.family     Font family of the label (e.g. “Times”, “Helvetica”)
edge.label.font       Font: 1 plain, 2 bold, 3 italic, 4 bold italic, 5 symbol
edge.label.cex        Font size for edge labels
edge.curved           Edge curvature, range 0-1 (FALSE sets it to 0, TRUE to 0.5)
edge.arrow.mode       Vector specifying whether edges should have arrows, possible values: 0 no arrow, 1 back, 2 forward, 3 both

OTHER
margin                Empty space margins around the plot, vector with length 4
frame                 If TRUE, the plot will be framed
main                  If set, adds a title to the plot
sub                   If set, adds a subtitle to the plot
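
As a quick illustration of how these parameters are passed to plot(), here is a minimal sketch on a toy ring graph; the graph and the chosen values are arbitrary and serve only to show the syntax.

```r
library(igraph)

# A toy graph: five nodes connected in a ring
toy <- make_ring(5)

plot(toy,
     vertex.color        = "orange",
     vertex.frame.color  = "black",
     vertex.shape        = "square",
     vertex.size         = 20,
     vertex.label        = LETTERS[1:5],
     vertex.label.cex    = 0.8,
     vertex.label.dist   = 2,        # move labels away from the nodes
     vertex.label.degree = -pi/2,    # place labels above the nodes
     edge.color          = "grey40",
     edge.width          = 2,
     edge.lty            = "dashed",
     edge.curved         = 0.2,
     main                = "Toy ring")
```

The same parameters can also be stored as vertex and edge attributes (for example V(toy)$color), in which case the prefixes vertex. and edge. are dropped.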

7 Review: Layout algorithms

Let’s take another look at the layout algorithms. We can use coloring of nodes (based on eigenvector centrality: high is red, low is white) to visually check how layouts work.

  1. Take a close look at the resultant graphs and compare them. Can you discern any pattern? Describe it.

your answer

layouts <- grep("^layout_", ls("package:igraph"), value=TRUE)[-1] 
# Remove layouts that do not apply to our graph.
layouts <- layouts[!grepl("bipartite|merge|norm|sugiyama", layouts)]

par(mfrow=c(1,2), mar=c(1,1,1,1))
for (layout in layouts) {
  set.seed(1) # seed before the layout is computed, so that random layouts are reproducible
  l <- do.call(layout, list(sw_network_eigen))
  plot(sw_network_eigen, edge.arrow.mode=0, layout=l, main=layout, vertex.size=7, vertex.label=NA, cex.main=2)
}

8 Code snippets

8.1 Comparing two vectors

While writing the chunks of code above, I made a mistake and loaded the unsplit edges data into the variable for the split edges data. As a result, when I tried to create an igraph object, R threw an error complaining that there were more nodes in the edges data than in the vertices data. To fix this, I needed to find out which values were missing. The function setdiff(x, y) is very helpful here: it returns the values of x that do not occur in y, and it works on vectors and, with dplyr loaded, on data frames.

Let’s compare nodes in two slightly different vectors: let’s take our vector with nodes for the prequel and create another version of it with an extra node that does not fit:

n_prequel <- unique(c(sw_e_prequel$source, sw_e_prequel$target))
n_prequel1 <- c(n_prequel, "Thanos")

Now we can compare them. The following code shows the values that are in n_prequel but not in n_prequel1:

setdiff(n_prequel, n_prequel1)
## character(0)

… and vice versa:

setdiff(n_prequel1, n_prequel)
## [1] "Thanos"

… and this is how we can check whether two vectors contain the same set of values (both differences must be empty):

v1 <- n_prequel
v2 <- n_prequel

setdiff(v1, v2)
## character(0)
setdiff(v2, v1)
## character(0)
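
Base R also offers setequal(), which performs the two-way check in one call. The minimal sketch below uses made-up vectors to show the difference between having the same set of values and being exactly identical: order and duplicates are ignored by setdiff() and setequal(), but not by identical().

```r
# Toy vectors: same values, but different order and one duplicate
toy1 <- c("a", "b", "c")
toy2 <- c("c", "b", "a", "a")

# Both differences are empty ...
only_in_1 <- setdiff(toy1, toy2)
only_in_2 <- setdiff(toy2, toy1)

# ... which is what setequal() tests in a single call
same_values <- setequal(toy1, toy2)

# identical() is stricter: order and length must match as well
exactly_same <- identical(toy1, toy2)

c(same_values = same_values, exactly_same = exactly_same)
```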

9 References

10 Data